Time-Bounded Best-First Search

Authors

  • Carlos Hernández
  • Roberto Javier Asín Achá
  • Jorge A. Baier
Abstract

Time-Bounded A* (TBA*) is a single-agent deterministic search algorithm that expands the states of a graph in the same order as A*, but that, unlike A*, interleaves search and action execution. Although the idea underlying TBA* can be generalized to other single-agent deterministic search algorithms, little is known about the impact on performance of using algorithms other than A*. In this paper we propose Time-Bounded Best-First Search (TB-BFS), a generalization of the time-bounded approach to any best-first search algorithm. Furthermore, we propose restarting strategies that allow TB-BFS to solve search problems in dynamic environments. In static environments, we prove that the resulting framework allows agents to always find a solution if one exists, and we prove cost bounds for the solutions returned by Time-Bounded Weighted A* (TB-WA*). We evaluate the performance of TB-WA* and Time-Bounded Greedy Best-First Search (TB-GBFS). We show that in pathfinding applications in static domains, TB-WA* and TB-GBFS are not only faster than TBA* but also find significantly better solutions in terms of cost. In the context of videogame pathfinding, TB-WA* and TB-GBFS perform fewer undesired movements than TBA*. Restarting TB-WA* was also evaluated on random maps for dynamic pathfinding, where we also observed improved performance compared to restarting TBA*. Our experimental results seem consistent with the theoretical bounds.

Introduction

In many search applications, time is a very scarce resource. Examples range from videogame pathfinding, where a handful of milliseconds are given to the search algorithm controlling automated characters (Bulitko et al. 2011), to highly dynamic robotics (Schmid et al. 2013). In those settings, it is usually assumed that a standard search algorithm will not be able to compute a complete solution before an action is required, and thus execution and search must be interleaved.
Time-Bounded A* (TBA*) (Björnsson, Bulitko, and Sturtevant 2009) is an algorithm suitable for searching under tight time constraints. In a nutshell, given a parameter $k$, it runs a standard A* search towards the goal rooted in the initial state, but after $k$ expansions are completed a move is performed and then search, if still needed, is resumed. It terminates when the agent has reached the goal.

Copyright © 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

TBA* is among the fastest algorithms for real-time-constrained applications. In fact, Hernández et al. (2012) showed that it significantly outperforms state-of-the-art real-time heuristic search algorithms such as RTAA* (Koenig and Likhachev 2006) and daRTAA* (Hernández and Baier 2012).

In this paper we extend the time-bounded search approach to a broader class of algorithms. Specifically, we propose Time-Bounded Best-First Search (TB-BFS), which is a family of time-bounded algorithms suitable for static environments. In addition, we propose two restart strategies, eager and lazy restart, that allow TB-BFS to solve search problems in dynamic domains, in which action costs may change during execution. Furthermore, we propose measures of quality inspired by videogame applications that go beyond solution cost and aim at capturing when solutions "look good" to an observer.

We focus on two instances of TB-BFS: Time-Bounded Weighted A* (TB-WA*) and Time-Bounded Greedy Best-First Search (TB-GBFS). Theoretically, we show that TB-BFS is complete, and establish an upper bound on the cost of solutions returned by TB-WA* in static domains. Our cost bound establishes that in some domains solution cost may be reduced significantly by increasing $w$; hence, in contrast to Weighted A*, we might obtain better solutions by increasing the weight. This result is important since it suggests that TB-WA* (with $w > 1$) should be preferred to TBA* in domains in which WA* runs faster than A*.
While WA* does not always run faster than A* (see, e.g., Wilt and Ruml 2012), it is known that it does in many domains. Experimentally, we evaluate the two algorithms on pathfinding benchmarks in static terrain and show that increasing $w$ allows finding significantly better solutions in less time. In terms of our videogame-inspired quality measures, TB-GBFS seems to be superior in static domains. In dynamic domains we evaluate TB-WA* on random maps and, yet again, we show that using weights greater than one improves performance significantly. We also show that lazy restart is superior to eager restart.

The rest of the paper is organized as follows. We start off by describing the background needed for the rest of the paper. Then we describe TB-BFS for static and dynamic domains and carry out a theoretical analysis. This is followed by a description of our videogame-inspired measures of quality. Then we describe the experimental results, and finish with a summary and perspectives for future research.

Background

Below we describe the background for the rest of the paper.

Search in Static and Dynamic Environments

A search graph is a tuple $G = (S, A)$, where $S$ is a set of states (vertices) and $A \subseteq S \times S$ is a set of edges that represent the actions available to the agent in each state. A path over graph $(S, A)$ is a sequence of states $\pi = s_0 s_1 \cdots s_n$, where $(s_i, s_{i+1}) \in A$ for all $i \in \{0, \ldots, n-1\}$. A cost function $c$ for a search graph $(S, A)$ is such that $c : A \to \mathbb{R}^{+} \cup \{\infty\}$; i.e., it associates each action with a positive, possibly infinite, cost. The cost of a path $\pi = s_0 s_1 \cdots s_n$ is $\sum_{i=0}^{n-1} c(s_i, s_{i+1})$, i.e., the sum of the costs of the edges in the path. Given $c$, we say that $t$ is a successor of $s$ if $(s, t)$ is an edge in $A$ with finite cost; moreover, for every $s \in S$ we define $\mathit{Succ}_c(s) = \{t \mid (s, t) \in A,\ c(s, t) \neq \infty\}$.
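As an illustration of these definitions, the following minimal Python sketch encodes a search graph, a cost function, the successor set, and path cost. The four-state graph, the names `EDGES`, `COST`, `succ`, and `path_cost`, and the specific cost values are assumptions made up for this example; they do not come from the paper.

```python
import math

# Hypothetical undirected search graph: every edge is stored in both directions.
EDGES = {
    ("a", "b"), ("b", "a"),
    ("b", "c"), ("c", "b"),
    ("a", "d"), ("d", "a"),
}

# Hypothetical cost function c: each edge gets a positive or infinite cost.
COST = {
    ("a", "b"): 1.0, ("b", "a"): 1.0,
    ("b", "c"): 2.5, ("c", "b"): 2.5,
    ("a", "d"): math.inf, ("d", "a"): math.inf,  # edge exists but is unusable
}


def succ(s, edges, cost):
    """Succ_c(s): all t such that (s, t) is an edge with finite cost."""
    return {t for (u, t) in edges if u == s and cost[(u, t)] != math.inf}


def path_cost(path, cost):
    """Sum of c(s_i, s_{i+1}) over consecutive states of the path."""
    return sum(cost[(path[i], path[i + 1])] for i in range(len(path) - 1))
```

Here `d` is connected to `a` by an edge, yet it is not a successor of `a` because that edge has infinite cost.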
Thus, our framework allows two alternatives for representing that state $t$ is not a successor of state $s$: either by saying $(s, t) \notin A$ or by defining $c(s, t) = \infty$. As we will see later, this helps with the definition of dynamic environments.

In this paper we assume the search graph is undirected, which informally means that every action is reversible. This restriction holds in many interesting search problems; we need it here because TBA*, and hence TB-BFS, may have the agent undo previously performed actions, a process known as physical backtracking.

We focus on search problems in two general settings: static and dynamic environments. A search problem in a static environment is a tuple $P = (S, A, c, s_{\mathit{start}}, s_{\mathit{goal}})$, where $G = (S, A)$ is a search graph, $c$ is a cost function, and $s_{\mathit{start}}, s_{\mathit{goal}} \in S$ are, respectively, the initial and the goal state. The problem is to compute a path $\pi = s_0 s_1 \cdots s_n$ such that $s_0 = s_{\mathit{start}}$, $s_n = s_{\mathit{goal}}$, and $s_{i+1} \in \mathit{Succ}_c(s_i)$ for all $i \in \{0, \ldots, n-1\}$.

For dynamic environments we assume that after the agent moves, the costs of the arcs may change. Given a tuple $P = (S, A, \gamma, s_{\mathit{start}}, s_{\mathit{goal}})$, where $S$, $A$, $s_{\mathit{start}}$, and $s_{\mathit{goal}}$ are as above and $\gamma = c_0 c_1 \cdots$ is an infinite sequence of cost functions over $(S, A)$, the problem is to compute a path $\pi = s_0 s_1 \cdots s_n$ such that $s_0 = s_{\mathit{start}}$, $s_n = s_{\mathit{goal}}$, and $s_{i+1} \in \mathit{Succ}_{c_i}(s_i)$ for all $i \in \{0, \ldots, n-1\}$. Observe that, given the way we define $\mathit{Succ}_{c_i}$, states can be disconnected from or reconnected to the search space as the agent executes actions.

Best-First Search

Best-First Search (Pearl 1984) captures a family of search algorithms for static environments that associate a priority $p(s)$ with every state $s$. The priority is computed using an evaluation function $f$ that assigns a lower value to states that are viewed as closer to the goal.
The algorithm starts off by initializing the priority of all nodes in the search space to infinity, except for $s_{\mathit{start}}$, whose priority is set to $f(s_{\mathit{start}})$. A priority queue Open is initialized as containing $s_{\mathit{start}}$. In each iteration, the algorithm extracts from Open the state $s$ with the lowest priority. For each successor $t$ of $s$ it computes the evaluation $f(t)$. If $f(t)$ is lower than $p(t)$, then $t$ is added to Open and $p(t)$ is set to $f(t)$. The algorithm repeats this process until $s_{\mathit{goal}}$ is in Open with the lowest priority.

An instance of Best-First Search is Weighted A* (WA*) (Pohl 1970). Its evaluation function is defined as $f(s) = g(s) + w\,h(s)$, where $g(s)$ is the cost of a path from $s_{\mathit{start}}$ to $s$, $h$ is a user-given heuristic function such that $h(s)$ estimates the cost of a path from $s$ to $s_{\mathit{goal}}$, and $w$ is a real number greater than or equal to 1. Note that the way $g(s)$ is computed here depends on the particular path discovered to $s$. Throughout execution, state $s$ may be rediscovered multiple times and hence receive different $g$-values. Function $h$ is admissible when $h(s)$ does not overestimate the cost of an optimal path from $s$ to $s_{\mathit{goal}}$, for all $s \in S$. If $h$ is admissible, WA* is known to find a solution whose cost cannot exceed $w c^{*}$, where $c^{*}$ is the cost of a shortest path from $s_{\mathit{start}}$ to $s_{\mathit{goal}}$. As such, WA* may return increasingly worse solutions as $w$ is increased. The advantage of increasing $w$ is that search usually runs faster as well. When $w = 1$, WA* is equivalent to A* (Hart, Nilsson, and Raphael 1968).

Another instance of Best-First Search is Greedy Best-First Search (GBFS). Here $f$ is equal to the user-given heuristic function $h$. When $w$ is very large, WA* is similar to GBFS but not equivalent; indeed, in both algorithms search is mainly driven by $h$, but in WA* $g(s)$ winds up acting as a tiebreaker.
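The generic loop described above, with WA* and GBFS obtained only by swapping the evaluation function, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the tiny example graph `EDGE_COST`, the heuristic table `H`, and the use of a binary heap with stale-entry skipping are all choices made for this example.

```python
import heapq
import itertools
import math


def best_first_search(succ, cost, h, f, start, goal):
    """Generic best-first search; f(g_value, h_value) gives a state's priority."""
    g = {start: 0.0}                      # cost of the path currently known to each state
    parent = {start: None}
    p = {start: f(0.0, h(start))}         # p(s); a missing key stands for infinity
    tie = itertools.count()               # tiebreaker so states are never compared
    open_heap = [(p[start], next(tie), start)]
    while open_heap:
        prio, _, s = heapq.heappop(open_heap)
        if prio > p[s]:
            continue                      # stale entry: s was re-added with a lower priority
        if s == goal:                     # goal has the lowest priority in Open
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        for t in succ(s):
            g_t = g[s] + cost(s, t)
            f_t = f(g_t, h(t))
            if f_t < p.get(t, math.inf):  # add t to Open only if its priority improves
                p[t], g[t], parent[t] = f_t, g_t, s
                heapq.heappush(open_heap, (f_t, next(tie), t))
    return None


def wa_star(w):
    """Evaluation function of WA*: f = g + w * h (w = 1 gives A*)."""
    return lambda g_value, h_value: g_value + w * h_value


def gbfs(g_value, h_value):
    """Evaluation function of GBFS: f = h."""
    return h_value


# Hypothetical undirected example graph and admissible heuristic.
EDGE_COST = {("s", "a"): 2.0, ("a", "g"): 2.0, ("s", "b"): 2.0, ("b", "g"): 5.0}
EDGE_COST.update({(t, s): c for (s, t), c in list(EDGE_COST.items())})
H = {"s": 3.0, "a": 2.0, "b": 1.0, "g": 0.0}


def succ(s):
    return [t for (u, t) in EDGE_COST if u == s]


def cost(s, t):
    return EDGE_COST[(s, t)]


astar_path = best_first_search(succ, cost, H.get, wa_star(1.0), "s", "g")
greedy_path = best_first_search(succ, cost, H.get, gbfs, "s", "g")
```

On this small graph, `astar_path` is the optimal path `s, a, g` of cost 4, while `greedy_path` is `s, b, g` of cost 7, because GBFS follows the lower heuristic value at `b` and ignores the accumulated cost.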

Similar resources

Time-Bounded Best-First Search for Reversible and Non-reversible Search Graphs

Time-Bounded A* is a real-time, single-agent, deterministic search algorithm that expands states of a graph in the same order as A* does, but that unlike A* interleaves search and action execution. Known to outperform state-of-the-art real-time search algorithms based on Korf’s Learning Real-Time A* (LRTA*) in some benchmarks, it has not been studied in detail and is sometimes not considered as...

Depth-First vs. Best-First Search: New Results

Best-first search (BFS) expands the fewest nodes among all admissible algorithms using the same cost function, but typically requires exponential space. Depth-first search needs space only linear in the maximum search depth, but expands more nodes than BFS. Using a random tree, we analytically show that the expected number of nodes expanded by depth-first branch-and-bound (DFBnB) is no more than...

Bounded Suboptimal Search in Linear Space: New Results

Bounded suboptimal search algorithms are usually faster than optimal ones, but they can still run out of memory on large problems. This paper makes three contributions. First, we show how solution length estimates, used by the current state-of-the-art linear-space bounded suboptimal search algorithm Iterative Deepening EES, can be used to improve unbounded-space suboptimal search. Second, we conv...

A* Search for Soft Constraints Bounded by Tree Decompositions

Some of the most efficient methods for solving soft constraints are based on heuristic search using an evaluation function that is mechanically generated from the problem. However, if only a few best solutions are needed, significant effort can be wasted pre-computing heuristics that are not used during search. Recently, a scheme for depth-first branch-and-bound search has been proposed that av...

Heuristic Search with Limited Memory

HEURISTIC SEARCH WITH LIMITED MEMORY by Matthew Hatem University of New Hampshire, May, 2014 Heuristic search algorithms are commonly used for solving problems in artificial intelligence. Unfortunately, the memory requirement of A*, the most widely used heuristic search algorithm, is often proportional to its running time, making it impractical for large problems. Several techniques exist for s...

BnB-ADOPT: an asynchronous branch-and-bound DCOP algorithm

Distributed constraint optimization (DCOP) problems are a popular way of formulating and solving agent-coordination problems. A DCOP problem is a problem where several agents coordinate their values such that the sum of the resulting constraint costs is minimal. It is often desirable to solve DCOP problems with memory-bounded and asynchronous algorithms. We introduce Branch-and-Bound ADOPT (BnB...

Publication date: 2014